
Conversation


@jangorecki jangorecki commented May 30, 2020

After discussing with Matt on Slack, we decided to narrow down the scope of #4687, so that the new benchmarking feature can be more usable and not introduce the extra maintenance burden that tracking historical timings, and the other features initially listed here, would require.

As a starting point I took the system.time tests that had already been taken out of the tests.Rraw file (to reduce the run time of the main test script). Those tests have been moved to a new benchmark() function that is meant to replace the test() function when system.time is needed.
To keep things simple, we don't need a new benchmark.data.table(); we can just call test.data.table("benchmarks.Rraw") or cc("benchmarks.Rraw") in dev mode, which will already recognize the benchmark() calls in the test script.
This is still very much a starting point, so any feedback is very welcome.

Ideas for improvement:

  • include a times argument to run each expression multiple times and take the mean/median to compare; this will allow a tighter tolerance (test 1110).
  • check available memory at the start and stop early if there is less than necessary
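
The times idea could look roughly like the following helper, which repeats a call and keeps the median elapsed time (a hypothetical sketch; median_time and its interface are illustrative only, not part of this PR):

```r
# Hypothetical helper: run f() `times` times and return the median elapsed
# seconds, so a tighter tolerance can be used than with a single run.
median_time = function(f, times = 5L) {
  elapsed = vapply(
    seq_len(times),
    function(i) system.time(f())[["elapsed"]],
    numeric(1L)
  )
  median(elapsed)
}
# e.g. median_time(function() DT[, .SD, by = grp]) for some data.table DT
```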

Initial proposal at bc8a8be

This PR brings a new set of scripts, and internal functions, to measure the performance of data.table. They are not run in any of our workflows as of now, but rather should be run manually. For now there is no point in merging this branch; I am opening the PR to make it easier to document the work and refer to it from other GitHub issues.

For example, it addresses "add timing test for many .SD cols #3797", for which the scripts are defined in the benchmarks.Rraw file. Yet to close #3797 we still need to add a rule, checked after all benchmarks, confirming that optimize=0 is not that much different from optimize=Inf.
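
Such a rule might be a post-processing check over the timings table. The following is only a sketch, assuming a base-R data.frame with the args/desc/elapsed columns as in the summary output shown later in this PR; the 1.5x tolerance is an arbitrary example:

```r
# Sketch: flag benchmarked expressions where the optimize=Inf run is much
# slower than the optimize=0L run. Column names (args = expression,
# desc = optimize setting, elapsed = seconds) mirror the summary table;
# both the names and the tolerance are illustrative assumptions.
check_optimize = function(timings, tolerance = 1.5) {
  opt0   = timings[timings$desc == "optimize=0L",  c("args", "elapsed")]
  optInf = timings[timings$desc == "optimize=Inf", c("args", "elapsed")]
  m = merge(opt0, optInf, by = "args", suffixes = c(".0", ".Inf"))
  m$args[m$elapsed.Inf > tolerance * m$elapsed.0]  # offending expressions
}
```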

data.table:::benchmark.data.table(libs=list.dirs("library/gcc", recursive=FALSE))
benchmark.data.table() running: benchmarks.Rraw
R_LIBS_USER=library/gcc/O0 R_DATATABLE_NUM_THREADS=1 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
R_LIBS_USER=library/gcc/O0 R_DATATABLE_NUM_THREADS=4 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
...
R_LIBS_USER=library/gcc/O0 R_DATATABLE_NUM_THREADS=40 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
R_LIBS_USER=library/gcc/O0-g R_DATATABLE_NUM_THREADS=1 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
...
R_LIBS_USER=library/gcc/O0-g R_DATATABLE_NUM_THREADS=40 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
R_LIBS_USER=library/gcc/O2 R_DATATABLE_NUM_THREADS=1 R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw
...
> data.table:::summary.benchmark()[-c(1:3)]
               cflags   num    fun          args         desc    th user_self sys_self elapsed
               <char> <num> <char>        <char>       <char> <int>     <num>    <num>   <num>
 1:               -O0  1.01      [ DT,,.SD,by=st  optimize=0L     1     1.395    0.230   1.625
 2:               -O0  1.02      [ DT,,.SD,by=st optimize=Inf     1     1.600    0.238   1.838
...
85: -O3 -mtune=native  1.01      [ DT,,.SD,by=st  optimize=0L     1     1.336    0.231   1.566
86: -O3 -mtune=native  1.02      [ DT,,.SD,by=st optimize=Inf     1     1.484    0.279   1.763
...
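
The matrix of Rscript invocations in the log above can be generated mechanically. Here is a rough base-R sketch; the two library paths are an example subset from the log, and actually running the commands via system() is left commented out:

```r
# Sketch: build one benchmark command per (compiled library, thread count)
# pair, mirroring the log above. In practice the libs would come from
# e.g. list.dirs("library/gcc", recursive = FALSE).
libs = c("library/gcc/O0", "library/gcc/O2")  # example subset from the log
threads = c(1L, 4L, 40L)
grid = expand.grid(lib = libs, th = threads, stringsAsFactors = FALSE)
cmds = sprintf(
  "R_LIBS_USER=%s R_DATATABLE_NUM_THREADS=%d R_DATATABLE_NUM_PROCS_PERCENT=100 Rscript inst/benchmarks/benchmarks.Rraw",
  grid$lib, grid$th
)
# for (cmd in cmds) system(cmd)
```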


jangorecki commented Aug 13, 2020

Should also cover comment in #4666

Possibly higher priority than valgrind is to run benchmark.Rraw in GLCI. We've had a few performance regressions recently that it'd be nice to nail down so they don't come back. It'd also be good to add focused low-level benchmarks, like the [[-by-group one, to benchmark.Rraw.

@jangorecki jangorecki added the WIP label Aug 13, 2020
@jangorecki jangorecki mentioned this pull request Aug 25, 2020

codecov bot commented Oct 6, 2020

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.77%. Comparing base (8f5ffa8) to head (73efaf3).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4517   +/-   ##
=======================================
  Coverage   98.77%   98.77%           
=======================================
  Files          81       81           
  Lines       15203    15203           
=======================================
  Hits        15017    15017           
  Misses        186      186           


@jangorecki jangorecki linked an issue Oct 9, 2020 that may be closed by this pull request
@MichaelChirico

cc @Anirban166 this old draft PR has some potential new benchmarking tests for #6078. If we extract the good tests from here I think we could close this PR too.

@MichaelChirico

@tdhock what would the format be for pulling examples like this into {atime}, exactly? Thus far all of our examples have been slow/fast pairs, but this is structured as standalone "status quo" tests that we'd like to prevent from slowing down in the future.

Would it be appropriate to add them as tests with just one label? "Fast" or "Status quo" maybe?


tdhock commented Jul 16, 2025

Sure, it is possible to omit the Slow reference and just include Fast (= status quo).
Actually you don't have to include any historical references in an atime test, but I think it is probably the best use of our dev time to focus on examples that somebody complained about in the past (i.e., that person depends on performance for some real application).
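
A Fast-only (status quo) test could then look roughly like the following. This is a sketch only: it assumes the {atime} package is installed and that atime::atime_test() accepts N/setup/expr as in its documented interface; the data sizes and grouping expression are illustrative, not taken from this PR:

```r
# Sketch of a "status quo" atime test: no Slow/Fast commit references,
# just the current behavior whose scaling we want to keep flat.
# Guarded so the snippet is runnable even without {atime} installed.
if (requireNamespace("atime", quietly = TRUE)) {
  status_quo_sd_test = atime::atime_test(
    N = 10^seq(3, 6),
    setup = {
      set.seed(1L)
      DT = data.table::data.table(
        grp = sample(max(N %/% 100L, 1L), N, replace = TRUE),
        x = rnorm(N)
      )
    },
    expr = DT[, lapply(.SD, sum), by = grp]
  )
}
```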


github-actions bot commented Jul 16, 2025

  • HEAD=benchmark stopped early for Serial value setting with :=
    Comparison Plot

Generated via commit 73efaf3

Download link for the artifact containing the test results: ↓ atime-results.zip

Task                                    Duration
R setup and installing dependencies     2 minutes and 45 seconds
Installing different package versions   20 seconds
Running and plotting the test cases     2 minutes and 37 seconds

@MichaelChirico MichaelChirico marked this pull request as ready for review July 16, 2025 21:42
@MichaelChirico

@jangorecki I trimmed away a lot of what was here & just pulled out a few common operations that are ubiquitous to serve as baseline performance tests, PTAL.

@tdhock I think since there's no reference to any commit hash in this branch, we can proceed to delete this branch after merging, right?

@MichaelChirico MichaelChirico changed the title benchmark Add some new "status quo" atime tests for common operations Jul 21, 2025


Development

Successfully merging this pull request may close these issues.

  • add timing test for many .SD cols
  • Continuous Benchmarking

3 participants